Addressing the ZooKeeper Synchronization Inefficiency
Authors
Abstract
In this paper we discuss the problem of synchronization in ZooKeeper, a fault-tolerant distributed coordination framework. One of ZooKeeper's key design choices is to move away from blocking APIs such as locks, in order to avoid problems with slow or faulty clients. Instead, it provides an event-like synchronization mechanism that notifies clients when state changes on the server. However, this mechanism leads to very inefficient implementations of synchronization objects such as queues or barriers. We propose a new solution to this problem: handle a sequence of client operations entirely on the server. The client expresses the required sequence of operations as a single request, which is sent to the server for execution via a generic API. We present a prototype that shares some of ZooKeeper's concepts but, unlike ZooKeeper, allows a very efficient implementation of synchronization objects. The solution requires a deterministic multi-threaded server, which we implement using a coroutine mechanism. Experiments show a significant gain in efficiency for our solution on producer-consumer queues and synchronization barriers.
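The idea of running a whole client sequence on the server can be illustrated with a small sketch. It assumes a toy single-process server in which Python generators stand in for the coroutine mechanism. The names `ToyServer`, `submit`, `consume`, and `produce` are hypothetical and chosen only for illustration; they are neither ZooKeeper's API nor the prototype's. A blocking dequeue is sent as one request, suspends on the server while the queue is empty, and resumes once a producer request adds an item, with no client notification round trips in between.

```python
from collections import deque


class ToyServer:
    """Hypothetical toy server; NOT ZooKeeper's or the paper's actual API."""

    def __init__(self):
        self.queue = deque()    # shared state: one producer-consumer queue
        self.pending = deque()  # suspended requests, kept in arrival order
        self.results = {}       # request id -> return value of the request

    def submit(self, req_id, script):
        """Accept a client request given as a generator function."""
        self.pending.append((req_id, script(self), lambda: True))
        self._run()

    def _run(self):
        # Resume requests in arrival order until none can make progress.
        # A fixed resumption order keeps the execution deterministic.
        progressed = True
        while progressed:
            progressed = False
            for _ in range(len(self.pending)):
                req_id, coro, ready = self.pending.popleft()
                if not ready():
                    self.pending.append((req_id, coro, ready))
                    continue
                progressed = True
                try:
                    # The request yields the condition it waits for next.
                    next_ready = next(coro)
                    self.pending.append((req_id, coro, next_ready))
                except StopIteration as done:
                    self.results[req_id] = done.value


def consume(server):
    # Blocking dequeue: suspend on the server until an item is available.
    while not server.queue:
        yield lambda: bool(server.queue)
    return server.queue.popleft()


def produce(item):
    def script(server):
        server.queue.append(item)
        return len(server.queue)
        yield  # unreachable; only marks this function as a generator
    return script


server = ToyServer()
server.submit("c1", consume)            # queue is empty, so c1 suspends
server.submit("p1", produce("job-1"))   # p1 enqueues, which unblocks c1
print(server.results)
```

In this sketch the consumer issues a single request and the waiting happens on the server, in contrast to a client that repeatedly reacts to change notifications and retries.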
Similar resources
ZooKeeper: Wait-free Coordination for Internet-scale Systems
In this paper, we describe ZooKeeper, a service for coordinating processes of distributed applications. Since ZooKeeper is part of critical infrastructure, ZooKeeper aims to provide a simple and high performance kernel for building more complex coordination primitives at the client. It incorporates elements from group messaging, shared registers, and distributed lock services in a replicated, c...
ZooKeeper’s atomic broadcast protocol: Theory and practice
Apache ZooKeeper is a distributed coordination service for cloud computing, providing essential synchronization and group services for other distributed applications. At its core lies an atomic broadcast protocol, which elects a leader, synchronizes the nodes, and performs broadcasts of updates from the leader. We study the design of this protocol, highlight promised properties, and analyze its...
Tail Latency in ZooKeeper and a Simple Reimplementation
ZooKeeper [1] is a commonly used service for coordinating distributed applications. ZooKeeper uses leader-based atomic broadcast for writes, so that all state modifications are globally totally ordered, but it allows stale reads from any server for high read availability. This design trades high read throughput for potentially high write latency. Unfortunately, the extent of this tradeoff and t...
On the Exploitation of Value Predication and Producer Identification to Reduce Barrier Synchronization Time
Barrier synchronization is a source of inefficiency in many parallel programs, due to the association of many producer-consumer relations with one synchronization variable. This inefficiency may consume a significant percentage of total execution time, especially as we increase the degree of parallelism while maintaining the problem size. Barrier synchronization wait time can be hidden by sp...
Running ZooKeeper Coordination Services in Untrusted Clouds
Cloud computing is a recent trend in computer science. However, privacy concerns and a lack of trust in cloud providers are an obstacle for many deployments. Maturing hardware support for implementing Trusted Execution Environments (TEEs) aims at mitigating these problems. Such technologies allow applications to run in a trusted environment, thereby protecting data from unauthorized access. To ...